Measuring the Quality of Pronunciation Dictionaries
Authors
Abstract
In this paper we investigate measures for evaluating pronunciation dictionaries that can be used independently of the type of lexicon, the language, the specific recognizer, and the way the dictionary was generated. We describe statistical measures, information-theoretic measures, and performance measures, and give examples of how they can be applied in practice: supervising data-driven dictionary training, selecting pronunciation variants, and evaluating the consistency of different dictionaries. Although the measures introduced are independent of the type of dictionary, we report only results obtained with data-driven dictionary generation and do not address measures specific to rule-based approaches.
Similar resources
Computer-Aided Quality Assurance of an Icelandic Pronunciation Dictionary
We propose a model-driven method for ensuring the quality of pronunciation dictionaries. The key ingredient is computing an alignment between letter strings and phoneme strings, a standard technique in pronunciation modeling. The novel aspect of our method is the use of informative, parametric alignment models which are refined iteratively as they are tested against the data. We discuss the use...
Measuring the Confusability of Pronunciations in Speech Recognition
In this work, we define a measure aimed at assessing how well a pronunciation model will function when used as a component of a speech recognition system. This measure, pronunciation entropy, fuses information from both the pronunciation model and the language model. We show how to compute this score by effectively composing the output of a phoneme recognizer with a pronunciation dictionary and...
Functional Pronunciation Dictionaries
This document describes the system Functional Pronunciation Dictionaries (FPD), a language-independent system for defining pronunciation lexicons. The starting point of this system is Functional Morphology (FM) [3, 2], where we asked ourselves the question: given that we already have defined a lexical resource in FM, what would be an efficient approach to extending the lexicon with high-quality pronu...
Efficient compression method for pronunciation dictionaries
Pronunciation dictionaries are often used with other data-driven methods to model the pronunciations in phoneme-based automatic speech recognition (ASR) and text-to-speech (TTS) systems. The dictionaries usually take a great amount of memory, which is a limiting factor in portable handheld devices. Compressing the pronunciation dictionaries results in minimal transmission bandwidth and less stora...
GlobalPhone: Pronunciation Dictionaries in 20 Languages
This paper describes the advances in the multilingual text and speech database GLOBALPHONE, a multilingual database of high-quality read speech with corresponding transcriptions and pronunciation dictionaries in 20 languages. GLOBALPHONE was designed to be uniform across languages with respect to the amount of data, speech quality, the collection scenario, the transcription and phone set convent...